upper envelope
Review for NeurIPS paper: BAIL: Best-Action Imitation Learning for Batch Deep Reinforcement Learning

Neural Information Processing Systems

Summary and Contributions: ---post author response--- Thank you for the response! The clarifications to the table have improved my understanding of the results. While I think the results are strong, the discussion section is jumbled/unclear, and the intuition behind some of the design decisions is lacking, giving an 'ad hoc' impression. These points are adequately addressed in the response, and I will increase my score to a 6, assuming the authors add these clarifications to the final text and make the experimental results section clearer. This work proposes a batch deep RL algorithm called BAIL. It essentially trains a policy using imitation learning with samples collected from state-action pairs whose (Monte Carlo) returns lie on what the authors define as the upper envelope of the data.


Rejection sampling from shape-constrained distributions in sublinear time

Chewi, Sinho, Gerber, Patrik, Lu, Chen, Gouic, Thibaut Le, Rigollet, Philippe

arXiv.org Machine Learning

We consider the task of generating exact samples from a target distribution, known up to normalization, over a finite alphabet. The classical algorithm for this task is rejection sampling, and although it has been used in practice for decades, there is surprisingly little study of its fundamental limitations. In this work, we study the query complexity of rejection sampling in a minimax framework for various classes of discrete distributions. Our results provide new algorithms for sampling whose complexity scales sublinearly with the alphabet size. When applied to adversarial bandits, we show that a slight modification of the Exp3 algorithm reduces the per-iteration complexity from $\mathcal O(K)$ to $\mathcal O(\log^2 K)$, where $K$ is the number of arms.
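The abstract's starting point is the classical rejection-sampling algorithm, not the paper's sublinear variant. A minimal sketch of that baseline for a finite alphabet, with an unnormalized target and a uniform proposal (the function and variable names here are illustrative, not from the paper):

```python
import random

def rejection_sample(weights, proposal, M, rng=random):
    """Draw one exact sample from the distribution proportional to `weights`
    over a finite alphabet via classical rejection sampling.

    `proposal` is a callable returning (index, q_prob) for a draw from a
    proposal distribution q; `M` must satisfy weights[i] <= M * q_prob
    for every index i, so the acceptance probability below is in [0, 1].
    """
    while True:
        i, q_i = proposal()
        # Accept index i with probability weights[i] / (M * q_i).
        if rng.random() < weights[i] / (M * q_i):
            return i

# Example: unnormalized target over a 3-letter alphabet, uniform proposal.
weights = [1.0, 3.0, 2.0]
K = len(weights)
uniform = lambda: (random.randrange(K), 1.0 / K)
M = max(weights) * K  # guarantees weights[i] <= M * (1/K) for all i
sample = rejection_sample(weights, uniform, M)
```

With a uniform proposal the expected number of iterations per sample grows with the alphabet size K in the worst case, which is the O(K)-style cost the paper's shape-constrained algorithms reduce to polylogarithmic in K.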


Enhancing Certified Robustness of Smoothed Classifiers via Weighted Model Ensembling

Liu, Chizhou, Feng, Yunzhen, Wang, Ranran, Dong, Bin

arXiv.org Machine Learning

Randomized smoothing has achieved state-of-the-art certified robustness against $l_2$-norm adversarial attacks. However, it remains unresolved how to find the optimal base classifier for randomized smoothing. In this work, we employ a Smoothed WEighted ENsembling (SWEEN) scheme to improve the performance of randomized smoothed classifiers. We theoretically analyze the expressive power of the SWEEN function class and show that SWEEN can be trained to achieve near-optimal risk in the randomized smoothing regime. We also develop an adaptive prediction algorithm to reduce the prediction and certification cost of SWEEN models. Extensive experiments show that SWEEN models outperform the upper envelope of their corresponding candidate models by a large margin. Moreover, SWEEN models constructed using a few small models can achieve comparable performance to a single large model with a notable reduction in training time.
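The core idea can be sketched in a few lines: the base classifier is a weighted average of the candidate models' class probabilities, and the smoothed prediction is a majority vote over Gaussian perturbations of the input. This is an assumed toy implementation (names and signatures are illustrative; the paper's weight training and certification procedure are not shown):

```python
from collections import Counter

import numpy as np

def sween_predict(x, models, weights, sigma, n_samples=100, seed=0):
    """Hypothetical sketch of prediction with a smoothed weighted ensemble.

    `models` is a list of callables mapping an input array to class
    probabilities; their weighted average plays the role of the SWEEN base
    classifier, which is then smoothed by voting over Gaussian perturbations.
    """
    rng = np.random.default_rng(seed)
    votes = Counter()
    for _ in range(n_samples):
        noisy = x + sigma * rng.standard_normal(x.shape)
        # Weighted soft ensemble of the candidate models' outputs.
        probs = sum(w * m(noisy) for w, m in zip(weights, models))
        votes[int(np.argmax(probs))] += 1
    # The most frequent class under noise is the smoothed prediction.
    return votes.most_common(1)[0][0]
```

The weights here are taken as given; in SWEEN they are learned so that the ensemble can beat the best single candidate model, which is why the abstract compares against the "upper envelope" of the candidates.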


BAIL: Best-Action Imitation Learning for Batch Deep Reinforcement Learning

Chen, Xinyue, Zhou, Zijian, Wang, Zheng, Wang, Che, Wu, Yanqiu, Deng, Qing, Ross, Keith

arXiv.org Artificial Intelligence

The field of Deep Reinforcement Learning (DRL) has recently seen a surge in research in batch reinforcement learning, which aims for sample-efficient learning from a given data set without additional interactions with the environment. In the batch DRL setting, commonly employed off-policy DRL algorithms can perform poorly and sometimes even fail to learn altogether. In this paper, we propose a new algorithm, Best-Action Imitation Learning (BAIL), which, unlike many off-policy DRL algorithms, does not involve maximizing Q functions over the action space. Striving for simplicity as well as performance, BAIL first selects from the batch the actions it believes to be high-performing actions for their corresponding states; it then uses those state-action pairs to train a policy network using imitation learning. Although BAIL is simple, we demonstrate that BAIL achieves state-of-the-art performance on the Mujoco benchmark.
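The two-stage structure described above (select high-performing pairs, then imitate them) can be sketched as follows. Note that BAIL's actual selection step fits a learned upper-envelope value function and keeps pairs whose Monte Carlo return is close to it; the global return quantile used here is a crude stand-in, and all names are illustrative:

```python
import numpy as np

def select_best_actions(states, actions, returns, top_frac=0.25):
    """Hypothetical selection step in the spirit of BAIL: keep the
    state-action pairs whose Monte Carlo return is in the top fraction
    of the batch. (BAIL itself compares each return against a learned
    state-dependent upper envelope rather than a global threshold.)
    """
    threshold = np.quantile(returns, 1.0 - top_frac)
    keep = returns >= threshold
    return states[keep], actions[keep]
```

The pairs returned by this step would then be fed to an ordinary supervised imitation-learning (behavior cloning) loop to train the policy network, with no Q-function maximization anywhere in the pipeline.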


Guaranteed bounds on the Kullback-Leibler divergence of univariate mixtures using piecewise log-sum-exp inequalities

Nielsen, Frank, Sun, Ke

arXiv.org Machine Learning

Information-theoretic measures such as the entropy, the cross-entropy and the Kullback-Leibler divergence between two mixture models are core primitives in many signal processing tasks. Since the Kullback-Leibler divergence of mixtures provably does not admit a closed-form formula, it is in practice either estimated using costly Monte-Carlo stochastic integration, approximated, or bounded using various techniques. We present a fast and generic method that builds algorithmically closed-form lower and upper bounds on the entropy, the cross-entropy and the Kullback-Leibler divergence of mixtures. We illustrate the versatility of the method by reporting on our experiments for approximating the Kullback-Leibler divergence between univariate exponential mixtures, Gaussian mixtures, Rayleigh mixtures, and Gamma mixtures.
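For context, the costly Monte Carlo stochastic-integration baseline that the abstract contrasts with can be sketched for univariate Gaussian mixtures as follows (a toy implementation under assumed conventions: a mixture is a tuple of weights, means, and standard deviations; the paper's contribution is deterministic closed-form bounds that avoid this sampling entirely):

```python
import math
import random

def mixture_pdf(x, weights, mus, sigmas):
    """Density of a univariate Gaussian mixture at x."""
    return sum(w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
               for w, m, s in zip(weights, mus, sigmas))

def kl_monte_carlo(p, q, n=10000, rng=random):
    """Estimate KL(p || q) between two Gaussian mixtures by sampling from p.

    This is the stochastic-integration estimator: draw x ~ p, average
    log(p(x) / q(x)). It converges slowly (O(1/sqrt(n)) error), which is
    the cost the closed-form bounds are designed to avoid.
    """
    w, mus, sigmas = p
    total = 0.0
    for _ in range(n):
        # Sample a mixture component, then a Gaussian value from it.
        k = rng.choices(range(len(w)), weights=w)[0]
        x = rng.gauss(mus[k], sigmas[k])
        total += math.log(mixture_pdf(x, *p) / mixture_pdf(x, *q))
    return total / n
```

With identical mixtures the estimator returns exactly zero; for distinct mixtures its accuracy depends entirely on the sample budget n, unlike the deterministic bounds proposed in the paper.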